-- The MAILING DATE of this communication appears on the cover sheet with the correspondence address --
Period for Reply 

A SHORTENED STATUTORY PERIOD FOR REPLY IS SET TO EXPIRE 3 MONTH(S) FROM 
THE MAILING DATE OF THIS COMMUNICATION. 

- Extensions of time may be available under the provisions of 37 CFR 1.136(a). In no event, however, may a reply be timely filed 
after SIX (6) MONTHS from the mailing date of this communication. 

- If the period for reply specified above is less than thirty (30) days, a reply within the statutory minimum of thirty (30) days will be considered timely. 

- If NO period for reply is specified above, the maximum statutory period will apply and will expire SIX (6) MONTHS from the mailing date of this communication. 
- Failure to reply within the set or extended period for reply will, by statute, cause the application to become ABANDONED (35 U.S.C. § 133).

- Any reply received by the Office later than three months after the mailing date of this communication, even if timely filed, may reduce any
earned patent term adjustment. See 37 CFR 1.704(b).

Status 

1)[X] Responsive to communication(s) filed on 20 March 2003.
2a)[ ] This action is FINAL. 2b)[X] This action is non-final.

3)[ ] Since this application is in condition for allowance except for formal matters, prosecution as to the merits is
closed in accordance with the practice under Ex parte Quayle, 1935 C.D. 11, 453 O.G. 213.

Disposition of Claims 

4)[X] Claim(s) 1-30 is/are pending in the application.

4a) Of the above claim(s) _____ is/are withdrawn from consideration.

5)[ ] Claim(s) _____ is/are allowed.

6)[X] Claim(s) 1-30 is/are rejected.

7)[ ] Claim(s) _____ is/are objected to.

8)[ ] Claim(s) _____ are subject to restriction and/or election requirement.

Application Papers 

9)[ ] The specification is objected to by the Examiner.

10)[ ] The drawing(s) filed on _____ is/are: a)[ ] accepted or b)[ ] objected to by the Examiner.

Applicant may not request that any objection to the drawing(s) be held in abeyance. See 37 CFR 1.85(a).
Replacement drawing sheet(s) including the correction is required if the drawing(s) is objected to. See 37 CFR 1.121(d).

11)[ ] The oath or declaration is objected to by the Examiner. Note the attached Office Action or form PTO-152.

Priority under 35 U.S.C. § 119 

12)[ ] Acknowledgment is made of a claim for foreign priority under 35 U.S.C. § 119(a)-(d) or (f).
a)[ ] All b)[ ] Some * c)[ ] None of:

1.[ ] Certified copies of the priority documents have been received.

2.[ ] Certified copies of the priority documents have been received in Application No. _____.

3.[ ] Copies of the certified copies of the priority documents have been received in this National Stage
application from the International Bureau (PCT Rule 17.2(a)).
* See the attached detailed Office action for a list of the certified copies not received.
DETAILED ACTION



1. This action is responsive to communications: original application filed 7/26/2001; said application is a
continuation-in-part (CIP) of application 09/294,701, filed 4/19/1999 (allowed, not yet issued).

2. Claims 1-30 are pending. Claims 1, 16 are independent claims. 



Claim Rejections - 35 USC § 112 

3. The following is a quotation of the second paragraph of 35 U.S.C. 112:

The specification shall conclude with one or more claims particularly pointing out and distinctly claiming the subject 
matter which the applicant regards as his invention. 

In regard to dependent claim 20, claim 20 recites the limitations "specified filtering criterion" and
"The computer readable medium". There is insufficient antecedent basis for these limitations in the claim. It
appears Applicant may have intended claim 20 to depend upon claim 19, since claim 19 appears to be the
only claim providing proper support for the limitations in question. Amending claim 20 to depend upon
claim 19, as the examiner suggests, will overcome this rejection.



Examiner's Note 

4. For the purpose of examination on the merits, the following rejections are based upon a possible 
interpretation of claim 20 as dependent upon claim 19. 



Claim Rejections - 35 USC § 101 

5. 35 U.S.C. 101 reads as follows: 



Whoever invents or discovers any new and useful process, machine, manufacture, or composition of matter, or any new 
and useful improvement thereof, may obtain a patent therefor, subject to the conditions and requirements of this title. 
6. The claimed invention (as claimed in claims 1-15) is directed to non-statutory subject matter.

In regard to independent claim 1, the combined limitations of claim 1 can be interpreted as a series of
manual and/or mental steps; said claim is therefore directed towards non-statutory subject matter. The
examiner's suggestion of changing the preamble of said claim to read "A computer executable method for ..."
will overcome this rejection.

In regard to dependent claims 2-15, claims 2-15 are rejected for fully incorporating the deficiencies of 
their base claims. 

Claim Rejections - 35 USC § 103 

7. The following is a quotation of 35 U.S.C. 103(a) which forms the basis for all obviousness rejections set 
forth in this Office action: 

(a) A patent may not be obtained though the invention is not identically disclosed or described as set forth in section 102 
of this title, if the differences between the subject matter sought to be patented and the prior art are such that the subject 
matter as a whole would have been obvious at the time the invention was made to a person having ordinary skill in the 
art to which said subject matter pertains. Patentability shall not be negatived by the manner in which the invention was 
made. 

8. Claims 1-30 are rejected under 35 U.S.C. 103(a) as being unpatentable over Huck, G., et al.
(hereinafter Huck), Jedi: extracting and synthesizing information from the Web, IEEE Cooperative 
Information Systems 1998, August 20-22, 1998, pp.32-41, in view of Weigel, A. et al. (hereinafter Weigel), 
Lexical postprocessing by heuristic search and automatic determination of the edit costs, IEEE Document 
Analysis and Recognition, 1995, August 14-16, 1995, pp.857-860. 
In regard to independent claim 1, Huck teaches a method of wrapper generation (JEDI) for extracting
data from documents, said method comprising creating grammars for identifying patterns of symbols, said
patterns containing prefix, value, and suffix patterns (Huck Abstract, also section 3 "Extraction Language",
second column, especially "<string><blanks><number><blanks><number>"). It is noted that Huck relies upon
a combination of pattern matching (as explained above) with grammars in its implementation.
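
For illustration only, the prefix/value/suffix structure described above can be sketched with an ordinary regular expression. This is not Huck's extraction language and not the claimed method; the sample text, the sub-patterns, and the function name `extract_values` are hypothetical.

```python
import re

def extract_values(text, prefix, value, suffix):
    """Return each substring matching `value` that lies between `prefix` and `suffix`.

    All three arguments are regular-expression fragments (hypothetical inputs).
    """
    # Group 1 captures only the value; prefix and suffix act as context.
    pattern = re.compile(f"{prefix}({value}){suffix}")
    return [match.group(1) for match in pattern.finditer(text)]

# Hypothetical sample document and sub-patterns.
sample = "Price: 42 USD, Price: 17 USD"
prices = extract_values(sample, r"Price:\s*", r"\d+", r"\s*USD")
print(prices)
```

Here the prefix and suffix only anchor the match; the captured group plays the role of the extracted value.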

Huck teaches resolving ambiguities by exploring all possible solutions, and then picking the best
solution, therefore teaching identification and eventual selection of candidate matches, based upon ranking
(Huck section 4 "Parsing Strategy", fifth paragraph from top of said section, see also section 7 "Conclusion and
Further Research", especially second paragraph from top of said section).

Huck does not specifically teach determining a "cost" associated with choosing a "best" ranked solution 
(candidate match). However, Weigel teaches automatic determination of edit costs (Weigel page 857, Title, 
section 1 "Introduction" - especially at top of second column, also page 859 section 6 "Learning the 
values of y"). It would have been obvious to one of ordinary skill in the art at the time of the invention to apply 
the edit costs of Weigel to Huck's ranking, providing Huck the benefit of increased accuracy in its ranked 
results. 

In regard to dependent claims 2, 3, Huck does not specifically teach edit distances. However, Weigel
teaches insertions, substitutions, and deletions, which are the operations used in calculating edit distances (see
Weigel page 858 sections 3 and 4, also page 859 - at top of first column; compare with claim 2). It would have
been obvious to one of ordinary skill in the art at the time of the invention to apply Weigel to Huck, providing
Huck the benefit of edit distance calculations for more accurate candidate selection.

Huck teaches an example pattern string (Huck section 3 "Extraction Language" - second column; 
compare with claim 3). 
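
As general background, an edit distance built from insertion, deletion, and substitution operations can be sketched as below. This is the textbook dynamic-programming formulation with per-operation costs, not Weigel's heuristic search and not the claimed method; the function name `edit_cost` and the default unit costs are assumptions made for the example.

```python
def edit_cost(source, target, ins=1.0, dele=1.0, sub=1.0):
    """Minimum total cost of turning `source` into `target`.

    `ins`, `dele`, and `sub` are hypothetical per-operation costs.
    """
    rows, cols = len(source) + 1, len(target) + 1
    dist = [[0.0] * cols for _ in range(rows)]
    # Turning a prefix of `source` into the empty string needs only deletions.
    for i in range(1, rows):
        dist[i][0] = i * dele
    # Turning the empty string into a prefix of `target` needs only insertions.
    for j in range(1, cols):
        dist[0][j] = j * ins
    for i in range(1, rows):
        for j in range(1, cols):
            same = source[i - 1] == target[j - 1]
            dist[i][j] = min(
                dist[i - 1][j] + dele,                        # delete a symbol
                dist[i][j - 1] + ins,                         # insert a symbol
                dist[i - 1][j - 1] + (0.0 if same else sub),  # keep or substitute
            )
    return dist[-1][-1]

print(edit_cost("kitten", "sitting"))
```

With unit costs this reduces to the ordinary Levenshtein distance; unequal costs let some operations count as more severe than others.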
In regard to dependent claims 4, 5, Huck teaches generation of all possible solutions (spans of
interest), as well as subjecting patterns/grammars as filters (Huck section 3 "Extraction Language" second 
paragraph from bottom of said section). Huck also teaches an example implementation of its invention 
comprising the query (typically involving keywords) of extracted data (Huck section 5 "Example"). 

In regard to dependent claims 6, 7, 8, Huck does not specifically teach thresholds, or lowest cost 
selection. However, Weigel teaches thresholds, as well as calculations for determining lowest costs (Weigel
page 859 section 5 "Speeding up the search"). It would have been obvious to one of ordinary skill in the art at 
the time of the invention to apply Weigel to Huck, providing Huck the benefit of edit distance calculations for 
more accurate candidate selection. 
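
The general idea of combining a cost threshold with lowest-cost selection can be sketched as follows. The claims are not reproduced in this action, so this is only a generic illustration; the function name `select_lowest_cost`, the candidate strings, and the cost values are hypothetical.

```python
def select_lowest_cost(candidates, threshold):
    """Pick the cheapest candidate whose cost does not exceed `threshold`.

    `candidates` is a list of (text, cost) pairs; returns None if every
    candidate is above the threshold.
    """
    eligible = [pair for pair in candidates if pair[1] <= threshold]
    if not eligible:
        return None
    return min(eligible, key=lambda pair: pair[1])

# Hypothetical candidate matches with precomputed costs.
matches = [("candidate a", 2.0), ("candidate b", 0.5), ("candidate c", 3.5)]
print(select_lowest_cost(matches, threshold=3.0))
```

Filtering before taking the minimum means a match is reported only when at least one candidate is acceptably close.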

In regard to dependent claims 9, 10, 11, Huck does not specifically teach adjustments, or weights, or 
addition. However, Weigel teaches calculations for determining costs, as well as adjustments and weights 
(Weigel page 859 section 5 "Speeding up the search", section 6 "Learning the values of y", also page 858 
section 3). It would have been obvious to one of ordinary skill in the art at the time of the invention to apply 
Weigel to Huck, providing Huck the benefit of various edit distance calculations for more accurate candidate 
selection. 
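
A weighted, additive cost with an adjustment term can be sketched as below. This is a generic illustration of weights, adjustments, and addition in a cost calculation, not a formula taken from Weigel or from the claims; the function name `combined_cost` and all numeric values are assumptions.

```python
def combined_cost(component_costs, weights, adjustment=0.0):
    """Weighted sum of component costs plus a constant adjustment.

    `component_costs` and `weights` are equal-length sequences of numbers
    (hypothetical inputs, e.g. one cost per matched sub-pattern).
    """
    if len(component_costs) != len(weights):
        raise ValueError("component_costs and weights must have the same length")
    weighted = sum(w * c for w, c in zip(weights, component_costs))
    return weighted + adjustment

# Hypothetical costs for three components, with the middle one weighted most.
print(combined_cost([1.0, 2.0, 0.0], [0.5, 1.0, 0.5], adjustment=0.25))
```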

In regard to dependent claim 12, claim 12 incorporates substantially similar subject matter as claimed 
in claim 1, and is rejected along the same rationale. 

In regard to dependent claims 13, 14, Huck teaches generation of all possible solutions (spans of 
interest), as well as subjecting patterns/grammars (i.e. regular expressions) as filters (Huck section 3 "Extraction 
Language" second paragraph from bottom of said section). Huck also teaches an example implementation of its 
invention comprising the query (typically involving keywords) of extracted data (Huck section 5 "Example"). 
In regard to dependent claim 15, claim 15 incorporates substantially similar subject matter as claimed
in claim 1, and is rejected along the same rationale. 

In regard to claims 16-30, claims 16-30 reflect the computer readable medium comprising computer 
executable instructions used for performing the methods as claimed in claims 1-15, respectively, and are 
rejected along the same rationale. 



Conclusion 

9. The prior art made of record and not relied upon is considered pertinent to applicant's disclosure. 

Carus, Alwin B. U.S. Patent No. 5,890,103 issued 03-1999

Hunter, Kenneth M. U.S. Patent No. 6,018,735 issued 01-2000 



10. Any inquiry concerning this communication or earlier communications from the examiner 
should be directed to William Bashore whose telephone number is (703) 308-5807. The examiner can 
normally be reached on Monday through Friday from 11:30 AM to 8:00 PM EST.

If attempts to reach the examiner by telephone are unsuccessful, the examiner's supervisor, 
Joseph Feild, can be reached on (703) 305-9792. 

Any inquiry of a general nature or relating to the status of this application should be directed to the 
Group receptionist whose telephone number is (703) 305-3900. 
11. Any response to this action should be mailed to:

Commissioner of Patents and Trademarks 
Washington, D.C. 20231 

or faxed to: 

(703) 872-9306 (for formal/after-final communications intended for entry)

Hand-delivered responses should be brought to Crystal Park II, 2121 Crystal Drive, 
Arlington, VA, Fourth Floor (Receptionist). 



William L. Bashore 
Patent Examiner, AU 2176 
June 10, 2004 



